5.12. Download the source file omp_mat_vect_rand_split.c from the book's website.
Find a program that does cache profiling (e.g., Valgrind [49]) and compile the program according to the instructions in the cache profiler documentation. (For example, with Valgrind you will want a symbol table and full optimization. With gcc, use gcc -g -O2 . . . .) Now run the program according to the instructions in the cache profiler documentation, using input $k \times (k \cdot 10^6)$, $(k \cdot 10^3) \times (k \cdot 10^3)$, and $(k \cdot 10^6) \times k$. Choose k so large that the number of level 2 cache misses is of the order $10^6$ for at least one of the input sets of data.
a. How many level 1 cache write-misses occur with each of the three inputs?
b. How many level 2 cache write-misses occur with each of the three inputs?
c. Where do most of the write-misses occur? For which input data does the
program have the most write-misses? Can you explain why?
d. How many level 1 cache read-misses occur with each of the three inputs?
e. How many level 2 cache read-misses occur with each of the three inputs?
f. Where do most of the read-misses occur? For which input data does the
program have the most read-misses? Can you explain why?
g. Run the program with each of the three inputs, but without using the cache
profiler. With which input is the program the fastest? With which input is
the program the slowest? Can your observations about cache misses help
explain the differences? How?
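
When interpreting the profiler output, it may help to keep the shape of the computation in mind. The following is only a minimal serial sketch, not the code in omp_mat_vect_rand_split.c: it assumes the matrix A is stored row-major in a single array and that the program computes y = Ax with a doubly nested loop. The names shape_dims and mat_vect are hypothetical and were chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: set m (rows) and n (columns) for one of the
   three input shapes in the exercise.
   shape 0: k x (k*10^6), shape 1: (k*10^3) x (k*10^3), shape 2: (k*10^6) x k.
   Returns 0 on success, -1 for an unknown shape. */
int shape_dims(int shape, size_t k, size_t *m, size_t *n) {
    assert(m != NULL && n != NULL);
    switch (shape) {
    case 0: *m = k;           *n = k * 1000000; return 0;
    case 1: *m = k * 1000;    *n = k * 1000;    return 0;
    case 2: *m = k * 1000000; *n = k;           return 0;
    default: return -1;
    }
}

/* Serial sketch of y = A*x (assumption: A is row-major, m*n doubles).
   Each outer iteration writes one element of y and reads one row of A
   together with all n elements of x. */
void mat_vect(const double *A, const double *x, double *y,
              size_t m, size_t n) {
    assert(A != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < m; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < n; j++)
            sum += A[i * n + j] * x[j];
        y[i] = sum;   /* the only write to y[i] in this sketch */
    }
}
```

For every shape, A holds the same number of elements, $k^2 \cdot 10^6$, but x has n elements and y has m elements, so the sizes of the vectors being read and written differ greatly across the three inputs. The actual program may accumulate directly into y[i] rather than into a local variable, which would change the write pattern; check the downloaded source before drawing conclusions.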
 
 